# Low-memory inference

Smollm 135M Instruct

A lightweight, instruction-tuned language model optimized for mobile deployment.

Large Language Model

litert-community

Openhands Lm 7b V0.1 GGUF

OpenHands LM is an open-source coding model built on Qwen Coder 2.5 Instruct 32B and fine-tuned specifically for software engineering tasks.

Large Language Model English

Falcon E 3B Instruct

Falcon-E-3B-Instruct is a language model based on a 1.58-bit architecture, optimized for inference on edge devices with low memory usage.

Large Language Model

Falcon E 1B Instruct

Falcon-E-1B-Instruct is a language model based on a 1.58-bit architecture, optimized for inference on edge devices with a low memory footprint.

Large Language Model
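
As a rough illustration of why a 1.58-bit architecture lowers memory usage, weight storage scales with parameter count times bits per weight. The sketch below is a back-of-the-envelope estimate only. The function name `weight_memory_gib` is hypothetical, the parameter counts are the nominal 1B and 3B taken from the model names rather than exact figures, and the estimate ignores activations, the KV cache, packing overhead, and any layers kept at higher precision.

```python
def weight_memory_gib(num_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GiB: params * bits / 8 bytes."""
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / 2**30


if __name__ == "__main__":
    # Nominal sizes from the model names (assumption, not exact counts).
    nominal_sizes = {"1B": 1e9, "3B": 3e9}
    for label, params in nominal_sizes.items():
        for bits in (16, 1.58):
            gib = weight_memory_gib(params, bits)
            print(f"{label} model at {bits} bits/weight: ~{gib:.2f} GiB of weights")
```

Real memory use on a device will be higher than this figure, so it is best read as a lower bound for comparing quantization levels.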

All MiniLM L6 V2 GGUF

all-MiniLM-L6-v2 is a compact and efficient sentence embedding model based on the MiniLM architecture, suitable for sentence similarity computation and feature extraction tasks.

Text Embedding English
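
Sentence similarity with an embedding model is commonly computed as the cosine similarity between two embedding vectors. The sketch below shows only that comparison step. The vectors are made-up three-dimensional stand-ins, not real model output (all-MiniLM-L6-v2 produces 384-dimensional embeddings), and `cosine_similarity` is a hypothetical helper name.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Toy embeddings for illustration only (assumption, not model output).
    sentence_a = [0.2, 0.7, 0.1]
    sentence_b = [0.25, 0.65, 0.05]
    sentence_c = [0.9, 0.0, 0.4]
    print("a vs b:", cosine_similarity(sentence_a, sentence_b))
    print("a vs c:", cosine_similarity(sentence_a, sentence_c))
```

Scores closer to 1 indicate vectors pointing in nearly the same direction, which is how semantically similar sentences are expected to appear.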

Meta Llama 3 8B Instruct GGUF

An IQ-DynamicGate ultra-low-bit (1-2 bit) quantization of Llama-3-8B-Instruct that uses precision-adaptive quantization to improve inference accuracy while keeping memory usage very low.

Large Language Model English

Smolvlm2 2.2B Instruct

SmolVLM2-2.2B is a lightweight multimodal model designed for analyzing video content. It can process video, image, and text inputs and generate text outputs.

Transformers English

Mosaicml Mpt 7b Chat Bnb 4bit Smashed

A compressed version of the MPT-7B-Chat model provided by PrunaAI, optimized with llm-int8 technology to significantly reduce memory usage and energy consumption.

Large Language Model Transformers Other


© 2025 AIbase